Dataset statistics
| Number of variables | 12 |
|---|---|
| Number of observations | 1551 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 212 |
| Duplicate rows (%) | 13.7% |
| Total size in memory | 145.5 KiB |
| Average record size in memory | 96.1 B |
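The derived figures in the table above can be cross-checked from the raw counts. The sketch below is illustrative only: `duplicate_row_stats` and the toy rows are hypothetical, and it assumes that every occurrence of a row after the first counts as a duplicate, which may differ from how the profiler counts them. With pandas, `DataFrame.duplicated()` gives a comparable per-row flag.

```python
from collections import Counter


def duplicate_row_stats(rows):
    """Return (extra_copies, percent_of_rows) for exact duplicate rows."""
    counts = Counter(tuple(row) for row in rows)
    # Assumed convention: every occurrence after the first is a duplicate.
    extra = sum(c - 1 for c in counts.values())
    percent = 100.0 * extra / len(rows) if rows else 0.0
    return extra, percent


# Hypothetical toy rows, not taken from the dataset.
toy_rows = [(1.0, 5), (1.0, 5), (2.0, 6), (1.0, 5)]
print(duplicate_row_stats(toy_rows))

# Cross-check of the figures reported in the table above.
n_obs, n_dup, total_kib = 1551, 212, 145.5
dup_pct = round(100 * n_dup / n_obs, 1)                 # duplicate rows (%)
avg_record_bytes = round(total_kib * 1024 / n_obs, 1)   # bytes per record
print(dup_pct, avg_record_bytes)
```

Both derived values agree with the reported 13.7% and 96.1 B.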
Variable types
| Numeric | 12 |
|---|---|

| Dataset has 212 (13.7%) duplicate rows | Duplicates |
|---|---|
| fixed acidity is highly correlated with citric acid and 2 other fields | High correlation |
| volatile acidity is highly correlated with citric acid | High correlation |
| citric acid is highly correlated with fixed acidity and 2 other fields | High correlation |
| free sulfur dioxide is highly correlated with total sulfur dioxide | High correlation |
| total sulfur dioxide is highly correlated with free sulfur dioxide | High correlation |
| density is highly correlated with fixed acidity | High correlation |
| pH is highly correlated with fixed acidity and 1 other field | High correlation |
| fixed acidity is highly correlated with pH | High correlation |
| pH is highly correlated with fixed acidity | High correlation |
| density is highly correlated with fixed acidity and 2 other fields | High correlation |
| fixed acidity is highly correlated with pH and 2 other fields | High correlation |
| alcohol is highly correlated with density | High correlation |
| residual sugar is highly correlated with density | High correlation |
| citric acid is highly correlated with pH and 2 other fields | High correlation |
| citric acid has 124 (8.0%) zeros | Zeros |
Reproduction
| Analysis started | 2021-08-22 13:15:25.103390 |
|---|---|
| Analysis finished | 2021-08-22 13:17:23.838464 |
| Duration | 1 minute and 58.74 seconds |
| Software version | pandas-profiling v3.0.0 |
| Download configuration | config.json |
fixed acidity
Real number (ℝ≥0)
| Distinct | 87 |
|---|---|
| Distinct (%) | 5.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.125608126 |
| Minimum | 0.9924381114 |
| Maximum | 1.2510185 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 12.2 KiB |
Quantile statistics
| Minimum | 0.9924381114 |
|---|---|
| 5-th percentile | 1.054038083 |
| Q1 | 1.092298335 |
| median | 1.120086909 |
| Q3 | 1.156454728 |
| 95-th percentile | 1.205981701 |
| Maximum | 1.2510185 |
| Range | 0.2585803887 |
| Interquartile range (IQR) | 0.06415639312 |
Descriptive statistics
| Standard deviation | 0.04602346171 |
|---|---|
| Coefficient of variation (CV) | 0.04088764168 |
| Kurtosis | -0.2804046955 |
| Mean | 1.125608126 |
| Median Absolute Deviation (MAD) | 0.03134674595 |
| Skewness | 0.1385741113 |
| Sum | 1745.818203 |
| Variance | 0.002118159028 |
| Monotonicity | Not monotonic |
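Several of the statistics above are simple functions of the others. The sketch below re-derives them from the reported values as a consistency check; the variable names are illustrative, and the tolerances only allow for the rounding of the printed figures.

```python
import math

# Values copied from the tables above.
n_obs = 1551
mean = 1.125608126
std = 0.04602346171
q1, q3 = 1.092298335, 1.156454728
minimum, maximum = 0.9924381114, 1.2510185

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = mean * n_obs             # sum of all observations

print(cv, variance, iqr, value_range, total)
```

The results match the reported CV, variance, IQR, range and sum to within rounding.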
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 1.09605244 | 67 | 4.3% |
| 1.092298335 | 57 | 3.7% |
| 1.11687459 | 53 | 3.4% |
| 1.106810998 | 51 | 3.3% |
| 1.088454951 | 50 | 3.2% |
| 1.113592807 | 49 | 3.2% |
| 1.129329785 | 45 | 2.9% |
| 1.080486385 | 45 | 2.9% |
| 1.110239128 | 45 | 2.9% |
| 1.103305739 | 44 | 2.8% |
| Other values (77) | 1045 | 67.4% |

| Value | Count | Frequency (%) |
|---|---|---|
| 0.9924381114 | 2 | 0.1% |
| 0.998928893 | 4 | 0.3% |
| 1.005214677 | 2 | 0.1% |
| 1.011305625 | 4 | 0.3% |
| 1.022940322 | 14 | 0.9% |
| 1.028501217 | 1 | 0.1% |
| 1.033901667 | 4 | 0.3% |
| 1.03914895 | 9 | 0.6% |
| 1.044249898 | 13 | 0.8% |
| 1.049210929 | 16 | 1.0% |

| Value | Count | Frequency (%) |
|---|---|---|
| 1.2510185 | 2 | 0.1% |
| 1.243076205 | 1 | 0.1% |
| 1.239470534 | 1 | 0.1% |
| 1.236994291 | 1 | 0.1% |
| 1.235733587 | 2 | 0.1% |
| 1.231857549 | 1 | 0.1% |
| 1.230533183 | 3 | 0.2% |
| 1.229192098 | 3 | 0.2% |
| 1.226458395 | 3 | 0.2% |
| 1.225065068 | 2 | 0.1% |
volatile acidity
Real number (ℝ≥0)
| Distinct | 140 |
|---|---|
| Distinct (%) | 9.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.5263120567 |
| Minimum | 0.12 |
| Maximum | 1.33 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 12.2 KiB |
Quantile statistics
| Minimum | 0.12 |
|---|---|
| 5-th percentile | 0.27 |
| Q1 | 0.39 |
| median | 0.52 |
| Q3 | 0.635 |
| 95-th percentile | 0.84 |
| Maximum | 1.33 |
| Range | 1.21 |
| Interquartile range (IQR) | 0.245 |
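The Q3 of 0.635 above lies halfway between two-decimal measurements, which is consistent with linear interpolation between neighbouring order statistics. The sketch below shows that interpolation rule; `quantile_linear` and the sample values are hypothetical, and whether the profiler uses exactly this method is an assumption.

```python
import math


def quantile_linear(values, q):
    """Quantile by linear interpolation between neighbouring order statistics."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    data = sorted(values)
    pos = q * (len(data) - 1)  # fractional index into the sorted data
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


# Hypothetical sample, not taken from the dataset.
sample = [0.39, 0.52, 0.63, 0.64]
q1 = quantile_linear(sample, 0.25)
q3 = quantile_linear(sample, 0.75)
iqr = q3 - q1  # interquartile range
print(q1, q3, iqr)
```

A quantile position that falls between two observations yields a value between them, which is how a figure with three decimals can arise from two-decimal data.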
Descriptive statistics
| Standard deviation | 0.1755455183 |
|---|---|
| Coefficient of variation (CV) | 0.3335388504 |
| Kurtosis | 0.6144300092 |
| Mean | 0.5263120567 |
| Median Absolute Deviation (MAD) | 0.12 |
| Skewness | 0.5557783182 |
| Sum | 816.31 |
| Variance | 0.03081622901 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 0.6 | 46 | 3.0% |
| 0.5 | 44 | 2.8% |
| 0.43 | 43 | 2.8% |
| 0.59 | 38 | 2.5% |
| 0.58 | 38 | 2.5% |
| 0.36 | 37 | 2.4% |
| 0.4 | 36 | 2.3% |
| 0.39 | 35 | 2.3% |
| 0.38 | 34 | 2.2% |
| 0.56 | 34 | 2.2% |
| Other values (130) | 1166 | 75.2% |

| Value | Count | Frequency (%) |
|---|---|---|
| 0.12 | 3 | 0.2% |
| 0.16 | 2 | 0.1% |
| 0.18 | 8 | 0.5% |
| 0.19 | 2 | 0.1% |
| 0.2 | 3 | 0.2% |
| 0.21 | 6 | 0.4% |
| 0.22 | 6 | 0.4% |
| 0.23 | 5 | 0.3% |
| 0.24 | 13 | 0.8% |
| 0.25 | 7 | 0.5% |

| Value | Count | Frequency (%) |
|---|---|---|
| 1.33 | 2 | 0.1% |
| 1.24 | 1 | 0.1% |
| 1.185 | 1 | 0.1% |
| 1.18 | 1 | 0.1% |
| 1.115 | 1 | 0.1% |
| 1.09 | 1 | 0.1% |
| 1.04 | 2 | 0.1% |
| 1.035 | 1 | 0.1% |
| 1.025 | 1 | 0.1% |
| 1.02 | 3 | 0.2% |
citric acid
Real number (ℝ≥0)
| Distinct | 78 |
|---|---|
| Distinct (%) | 5.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.2705738233 |
| Minimum | 0 |
| Maximum | 0.78 |
| Zeros | 124 |
| Zeros (%) | 8.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 12.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.1 |
| median | 0.26 |
| Q3 | 0.42 |
| 95-th percentile | 0.6 |
| Maximum | 0.78 |
| Range | 0.78 |
| Interquartile range (IQR) | 0.32 |
Descriptive statistics
| Standard deviation | 0.1924782171 |
|---|---|
| Coefficient of variation (CV) | 0.7113704302 |
| Kurtosis | -0.8911411064 |
| Mean | 0.2705738233 |
| Median Absolute Deviation (MAD) | 0.16 |
| Skewness | 0.2818176003 |
| Sum | 419.66 |
| Variance | 0.03704786406 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 124 | 8.0% |
| 0.49 | 65 | 4.2% |
| 0.24 | 48 | 3.1% |
| 0.02 | 47 | 3.0% |
| 0.26 | 38 | 2.5% |
| 0.1 | 35 | 2.3% |
| 0.21 | 33 | 2.1% |
| 0.08 | 33 | 2.1% |
| 0.32 | 32 | 2.1% |
| 0.01 | 32 | 2.1% |
| Other values (68) | 1064 | 68.6% |

| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 124 | 8.0% |
| 0.01 | 32 | 2.1% |
| 0.02 | 47 | 3.0% |
| 0.03 | 29 | 1.9% |
| 0.04 | 28 | 1.8% |
| 0.05 | 20 | 1.3% |
| 0.06 | 24 | 1.5% |
| 0.07 | 22 | 1.4% |
| 0.08 | 33 | 2.1% |
| 0.09 | 28 | 1.8% |

| Value | Count | Frequency (%) |
|---|---|---|
| 0.78 | 1 | 0.1% |
| 0.76 | 2 | 0.1% |
| 0.75 | 1 | 0.1% |
| 0.74 | 4 | 0.3% |
| 0.73 | 3 | 0.2% |
| 0.72 | 1 | 0.1% |
| 0.71 | 1 | 0.1% |
| 0.7 | 2 | 0.1% |
| 0.69 | 4 | 0.3% |
| 0.68 | 8 | 0.5% |
residual sugar
Real number (ℝ≥0)
| Distinct | 85 |
|---|---|
| Distinct (%) | 5.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.5130771125 |
| Minimum | 0.1640382198 |
| Maximum | 0.809290928 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 12.2 KiB |
Quantile statistics
| Minimum | 0.1640382198 |
|---|---|
| 5-th percentile | 0.3607363954 |
| Q1 | 0.4500494896 |
| median | 0.5131574298 |
| Q3 | 0.5729387001 |
| 95-th percentile | 0.7073442049 |
| Maximum | 0.809290928 |
| Range | 0.6452527082 |
| Interquartile range (IQR) | 0.1228892104 |
Descriptive statistics
| Standard deviation | 0.1038069158 |
|---|---|
| Coefficient of variation (CV) | 0.2023222499 |
| Kurtosis | 0.5064301426 |
| Mean | 0.5130771125 |
| Median Absolute Deviation (MAD) | 0.0631079402 |
| Skewness | 0.1026886922 |
| Sum | 795.7826015 |
| Variance | 0.01077587576 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 0.4733815895 | 154 | 9.9% |
| 0.5131574298 | 128 | 8.3% |
| 0.49430184 | 125 | 8.1% |
| 0.4238773323 | 124 | 8.0% |
| 0.4500494896 | 117 | 7.5% |
| 0.5302327009 | 108 | 7.0% |
| 0.5457626608 | 84 | 5.4% |
| 0.5599431672 | 84 | 5.4% |
| 0.5729387001 | 79 | 5.1% |
| 0.3943310756 | 74 | 4.8% |
| Other values (75) | 474 | 30.6% |

| Value | Count | Frequency (%) |
|---|---|---|
| 0.1640382198 | 5 | 0.3% |
| 0.2256276523 | 4 | 0.3% |
| 0.2776890781 | 34 | 2.2% |
| 0.3222299619 | 29 | 1.9% |
| 0.3607363954 | 56 | 3.6% |
| 0.3780886209 | 2 | 0.1% |
| 0.3943310756 | 74 | 4.8% |
| 0.4095643368 | 2 | 0.1% |
| 0.4238773323 | 124 | 8.0% |
| 0.4500494896 | 117 | 7.5% |

| Value | Count | Frequency (%) |
|---|---|---|
| 0.809290928 | 1 | 0.1% |
| 0.8076153887 | 1 | 0.1% |
| 0.8057977635 | 1 | 0.1% |
| 0.795571235 | 1 | 0.1% |
| 0.783871814 | 1 | 0.1% |
| 0.7830309794 | 1 | 0.1% |
| 0.7821692905 | 2 | 0.1% |
| 0.78038026 | 1 | 0.1% |
| 0.777519973 | 1 | 0.1% |
| 0.7754844651 | 2 | 0.1% |
chlorides
Real number (ℝ≥0)
| Distinct | 142 |
|---|---|
| Distinct (%) | 9.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.08644487427 |
| Minimum | 0.038 |
| Maximum | 0.611 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 12.2 KiB |
Quantile statistics
| Minimum | 0.038 |
|---|---|
| 5-th percentile | 0.0555 |
| Q1 | 0.071 |
| median | 0.079 |
| Q3 | 0.09 |
| 95-th percentile | 0.122 |
| Maximum | 0.611 |
| Range | 0.573 |
| Interquartile range (IQR) | 0.019 |
Descriptive statistics
| Standard deviation | 0.04278465987 |
|---|---|
| Coefficient of variation (CV) | 0.4949357638 |
| Kurtosis | 46.14881628 |
| Mean | 0.08644487427 |
| Median Absolute Deviation (MAD) | 0.009 |
| Skewness | 5.977871288 |
| Sum | 134.076 |
| Variance | 0.00183052712 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 0.08 | 66 | 4.3% |
| 0.074 | 55 | 3.5% |
| 0.078 | 51 | 3.3% |
| 0.076 | 51 | 3.3% |
| 0.084 | 49 | 3.2% |
| 0.077 | 47 | 3.0% |
| 0.082 | 46 | 3.0% |
| 0.075 | 45 | 2.9% |
| 0.071 | 45 | 2.9% |
| 0.079 | 43 | 2.8% |
| Other values (132) | 1053 | 67.9% |

| Value | Count | Frequency (%) |
|---|---|---|
| 0.038 | 2 | 0.1% |
| 0.039 | 4 | 0.3% |
| 0.041 | 2 | 0.1% |
| 0.042 | 3 | 0.2% |
| 0.043 | 1 | 0.1% |
| 0.044 | 3 | 0.2% |
| 0.045 | 3 | 0.2% |
| 0.046 | 3 | 0.2% |
| 0.047 | 4 | 0.3% |
| 0.048 | 4 | 0.3% |

| Value | Count | Frequency (%) |
|---|---|---|
| 0.611 | 1 | 0.1% |
| 0.467 | 1 | 0.1% |
| 0.464 | 1 | 0.1% |
| 0.422 | 1 | 0.1% |
| 0.415 | 3 | 0.2% |
| 0.414 | 1 | 0.1% |
| 0.413 | 1 | 0.1% |
| 0.403 | 1 | 0.1% |
| 0.401 | 1 | 0.1% |
| 0.387 | 1 | 0.1% |
free sulfur dioxide
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 56 |
|---|---|
| Distinct (%) | 3.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.105653517 |
| Minimum | 0 |
| Maximum | 5.926269153 |
| Zeros | 3 |
| Zeros (%) | 0.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 12.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1.535539362 |
| Q1 | 2.248370806 |
| median | 3.10733293 |
| Q3 | 3.827880953 |
| 95-th percentile | 4.628351204 |
| Maximum | 5.926269153 |
| Range | 5.926269153 |
| Interquartile range (IQR) | 1.579510147 |
Descriptive statistics
| Standard deviation | 0.9747761001 |
|---|---|
| Coefficient of variation (CV) | 0.313871491 |
| Kurtosis | -0.6438567568 |
| Mean | 3.105653517 |
| Median Absolute Deviation (MAD) | 0.7931477906 |
| Skewness | -0.04569342 |
| Sum | 4816.868605 |
| Variance | 0.9501884453 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 2.046205856 | 136 | 8.8% |
| 1.812859058 | 103 | 6.6% |
| 3.317115946 | 77 | 5.0% |
| 2.991877426 | 75 | 4.8% |
| 2.73384422 | 74 | 4.8% |
| 2.248370806 | 70 | 4.5% |
| 2.587813607 | 62 | 4.0% |
| 2.867881765 | 59 | 3.8% |
| 3.50420748 | 59 | 3.8% |
| 3.10733293 | 55 | 3.5% |
| Other values (46) | 781 | 50.4% |

| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 3 | 0.2% |
| 0.7291977496 | 1 | 0.1% |
| 1.191008045 | 49 | 3.2% |
| 1.535539362 | 41 | 2.6% |
| 1.812859058 | 103 | 6.6% |
| 1.934073038 | 1 | 0.1% |
| 2.046205856 | 136 | 8.8% |
| 2.248370806 | 70 | 4.5% |
| 2.427186101 | 55 | 3.5% |
| 2.587813607 | 62 | 4.0% |

| Value | Count | Frequency (%) |
|---|---|---|
| 5.926269153 | 1 | 0.1% |
| 5.499075075 | 1 | 0.1% |
| 5.402252634 | 1 | 0.1% |
| 5.368955439 | 1 | 0.1% |
| 5.33511674 | 3 | 0.2% |
| 5.300717059 | 4 | 0.3% |
| 5.265735819 | 2 | 0.1% |
| 5.193940326 | 2 | 0.1% |
| 5.157078605 | 1 | 0.1% |
| 5.119540169 | 1 | 0.1% |
total sulfur dioxide
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 142 |
|---|---|
| Distinct (%) | 9.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.943582802 |
| Minimum | 1.875722899 |
| Maximum | 5.828287448 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 12.2 KiB |
Quantile statistics
| Minimum | 1.875722899 |
|---|---|
| 5-th percentile | 2.549852854 |
| Q1 | 3.346596028 |
| median | 3.962830575 |
| Q3 | 4.550629496 |
| 95-th percentile | 5.325481222 |
| Maximum | 5.828287448 |
| Range | 3.952564549 |
| Interquartile range (IQR) | 1.204033468 |
Descriptive statistics
| Standard deviation | 0.838462075 |
|---|---|
| Coefficient of variation (CV) | 0.2126142945 |
| Kurtosis | -0.7006047324 |
| Mean | 3.943582802 |
| Median Absolute Deviation (MAD) | 0.6162345468 |
| Skewness | -0.005040227028 |
| Sum | 6116.496925 |
| Variance | 0.7030186512 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 3.630437137 | 43 | 2.8% |
| 3.448605795 | 36 | 2.3% |
| 3.11304496 | 35 | 2.3% |
| 2.902901322 | 35 | 2.3% |
| 2.823885796 | 33 | 2.1% |
| 3.23537255 | 33 | 2.1% |
| 3.751278659 | 32 | 2.1% |
| 3.994882181 | 31 | 2.0% |
| 3.587410737 | 30 | 1.9% |
| 3.398653799 | 30 | 1.9% |
| Other values (132) | 1213 | 78.2% |

| Value | Count | Frequency (%) |
|---|---|---|
| 1.875722899 | 3 | 0.2% |
| 2.045204945 | 4 | 0.3% |
| 2.193092181 | 14 | 0.9% |
| 2.324372176 | 13 | 0.8% |
| 2.442472718 | 27 | 1.7% |
| 2.549852854 | 26 | 1.7% |
| 2.64833766 | 29 | 1.9% |
| 2.739319624 | 28 | 1.8% |
| 2.823885796 | 33 | 2.1% |
| 2.902901322 | 35 | 2.3% |

| Value | Count | Frequency (%) |
|---|---|---|
| 5.828287448 | 1 | 0.1% |
| 5.788447911 | 1 | 0.1% |
| 5.747408557 | 1 | 0.1% |
| 5.730639946 | 1 | 0.1% |
| 5.722177457 | 1 | 0.1% |
| 5.713661953 | 2 | 0.1% |
| 5.696469138 | 1 | 0.1% |
| 5.687790413 | 2 | 0.1% |
| 5.679055844 | 3 | 0.2% |
| 5.661416185 | 3 | 0.2% |
density
Real number (ℝ≥0)
| Distinct | 416 |
|---|---|
| Distinct (%) | 26.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.9967721599 |
| Minimum | 0.9912 |
| Maximum | 1.00289 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 12.2 KiB |
Quantile statistics
| Minimum | 0.9912 |
|---|---|
| 5-th percentile | 0.993855 |
| Q1 | 0.99565 |
| median | 0.99676 |
| Q3 | 0.99781 |
| 95-th percentile | 0.9998 |
| Maximum | 1.00289 |
| Range | 0.01169 |
| Interquartile range (IQR) | 0.00216 |
Descriptive statistics
| Standard deviation | 0.001729584635 |
|---|---|
| Coefficient of variation (CV) | 0.001735185536 |
| Kurtosis | 0.2683798665 |
| Mean | 0.9967721599 |
| Median Absolute Deviation (MAD) | 0.0011 |
| Skewness | 0.1123607415 |
| Sum | 1545.99362 |
| Variance | 2.991463009 × 10⁻⁶ |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 0.9972 | 36 | 2.3% |
| 0.9968 | 35 | 2.3% |
| 0.9976 | 34 | 2.2% |
| 0.998 | 29 | 1.9% |
| 0.9962 | 27 | 1.7% |
| 0.9978 | 26 | 1.7% |
| 0.997 | 24 | 1.5% |
| 0.9982 | 23 | 1.5% |
| 0.9964 | 23 | 1.5% |
| 0.9966 | 23 | 1.5% |
| Other values (406) | 1271 | 81.9% |

| Value | Count | Frequency (%) |
|---|---|---|
| 0.9912 | 1 | 0.1% |
| 0.9915 | 1 | 0.1% |
| 0.99162 | 1 | 0.1% |
| 0.99191 | 1 | 0.1% |
| 0.9922 | 2 | 0.1% |
| 0.99235 | 1 | 0.1% |
| 0.99236 | 1 | 0.1% |
| 0.9924 | 3 | 0.2% |
| 0.99242 | 2 | 0.1% |
| 0.99252 | 1 | 0.1% |

| Value | Count | Frequency (%) |
|---|---|---|
| 1.00289 | 1 | 0.1% |
| 1.0026 | 1 | 0.1% |
| 1.0022 | 2 | 0.1% |
| 1.0021 | 2 | 0.1% |
| 1.0014 | 6 | 0.4% |
| 1.001 | 6 | 0.4% |
| 1.0008 | 3 | 0.2% |
| 1.0006 | 6 | 0.4% |
| 1.0004 | 9 | 0.6% |
| 1.0003 | 2 | 0.1% |
pH
Real number (ℝ≥0)
| Distinct | 79 |
|---|---|
| Distinct (%) | 5.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.310399742 |
| Minimum | 2.86 |
| Maximum | 3.78 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 12.2 KiB |
Quantile statistics
| Minimum | 2.86 |
|---|---|
| 5-th percentile | 3.07 |
| Q1 | 3.21 |
| median | 3.31 |
| Q3 | 3.4 |
| 95-th percentile | 3.56 |
| Maximum | 3.78 |
| Range | 0.92 |
| Interquartile range (IQR) | 0.19 |
Descriptive statistics
| Standard deviation | 0.1443163017 |
|---|---|
| Coefficient of variation (CV) | 0.04359482629 |
| Kurtosis | 0.07326044267 |
| Mean | 3.310399742 |
| Median Absolute Deviation (MAD) | 0.09 |
| Skewness | 0.06090359956 |
| Sum | 5134.43 |
| Variance | 0.02082719494 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 3.36 | 56 | 3.6% |
| 3.3 | 56 | 3.6% |
| 3.26 | 51 | 3.3% |
| 3.38 | 48 | 3.1% |
| 3.39 | 48 | 3.1% |
| 3.29 | 46 | 3.0% |
| 3.32 | 44 | 2.8% |
| 3.34 | 43 | 2.8% |
| 3.28 | 42 | 2.7% |
| 3.31 | 39 | 2.5% |
| Other values (69) | 1078 | 69.5% |

| Value | Count | Frequency (%) |
|---|---|---|
| 2.86 | 1 | 0.1% |
| 2.88 | 2 | 0.1% |
| 2.89 | 2 | 0.1% |
| 2.92 | 1 | 0.1% |
| 2.93 | 3 | 0.2% |
| 2.94 | 4 | 0.3% |
| 2.98 | 4 | 0.3% |
| 2.99 | 2 | 0.1% |
| 3 | 5 | 0.3% |
| 3.01 | 3 | 0.2% |

| Value | Count | Frequency (%) |
|---|---|---|
| 3.78 | 2 | 0.1% |
| 3.72 | 2 | 0.1% |
| 3.71 | 3 | 0.2% |
| 3.69 | 4 | 0.3% |
| 3.68 | 2 | 0.1% |
| 3.67 | 3 | 0.2% |
| 3.66 | 4 | 0.3% |
| 3.63 | 3 | 0.2% |
| 3.62 | 4 | 0.3% |
| 3.61 | 8 | 0.5% |
sulphates
Real number (ℝ)
| Distinct | 92 |
|---|---|
| Distinct (%) | 5.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -0.6117173117 |
| Minimum | -2.114887349 |
| Maximum | 0.4857661269 |
| Zeros | 1 |
| Zeros (%) | 0.1% |
| Negative | 1497 |
| Negative (%) | 96.5% |
| Memory size | 12.2 KiB |
Quantile statistics
| Minimum | -2.114887349 |
|---|---|
| 5-th percentile | -1.157869371 |
| Q1 | -0.8350811533 |
| median | -0.6228176745 |
| Q3 | -0.3736894824 |
| 95-th percentile | -0.07544061908 |
| Maximum | 0.4857661269 |
| Range | 2.600653476 |
| Interquartile range (IQR) | 0.4613916709 |
Descriptive statistics
| Standard deviation | 0.3348132031 |
|---|---|
| Coefficient of variation (CV) | -0.5473332154 |
| Kurtosis | 0.3393083807 |
| Mean | -0.6117173117 |
| Median Absolute Deviation (MAD) | 0.2122634787 |
| Skewness | -0.02792970866 |
| Sum | -948.7735505 |
| Variance | 0.112099881 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| -0.7376418385 | 68 | 4.4% |
| -0.678256673 | 68 | 4.4% |
| -0.8700438099 | 68 | 4.4% |
| -0.6228176745 | 59 | 3.8% |
| -0.8014051889 | 58 | 3.7% |
| -0.7689468332 | 54 | 3.5% |
| -0.8350811533 | 50 | 3.2% |
| -0.9063675399 | 49 | 3.2% |
| -0.707430378 | 49 | 3.2% |
| -0.5964582035 | 48 | 3.1% |
| Other values (82) | 980 | 63.2% |

| Value | Count | Frequency (%) |
|---|---|---|
| -2.114887349 | 1 | 0.1% |
| -1.765173064 | 2 | 0.1% |
| -1.617990077 | 4 | 0.3% |
| -1.550083192 | 3 | 0.2% |
| -1.424258905 | 5 | 0.3% |
| -1.365867732 | 8 | 0.5% |
| -1.310211109 | 14 | 0.9% |
| -1.257103091 | 12 | 0.8% |
| -1.206374142 | 18 | 1.2% |
| -1.157869371 | 19 | 1.2% |

| Value | Count | Frequency (%) |
|---|---|---|
| 0.4857661269 | 1 | 0.1% |
| 0.478315085 | 2 | 0.1% |
| 0.3774959193 | 1 | 0.1% |
| 0.3544273194 | 1 | 0.1% |
| 0.2623243004 | 2 | 0.1% |
| 0.2515522306 | 1 | 0.1% |
| 0.2347578696 | 1 | 0.1% |
| 0.2171512617 | 2 | 0.1% |
| 0.2049330136 | 1 | 0.1% |
| 0.1792563921 | 1 | 0.1% |
alcohol
Real number (ℝ≥0)
| Distinct | 62 |
|---|---|
| Distinct (%) | 4.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10.39115624 |
| Minimum | 8.4 |
| Maximum | 14 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 12.2 KiB |
Quantile statistics
| Minimum | 8.4 |
|---|---|
| 5-th percentile | 9.2 |
| Q1 | 9.5 |
| median | 10.1 |
| Q3 | 11 |
| 95-th percentile | 12.4 |
| Maximum | 14 |
| Range | 5.6 |
| Interquartile range (IQR) | 1.5 |
Descriptive statistics
| Standard deviation | 1.01453631 |
|---|---|
| Coefficient of variation (CV) | 0.09763459292 |
| Kurtosis | -0.1858881517 |
| Mean | 10.39115624 |
| Median Absolute Deviation (MAD) | 0.7 |
| Skewness | 0.7584740148 |
| Sum | 16116.68333 |
| Variance | 1.029283924 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
|---|---|---|
| 9.5 | 136 | 8.8% |
| 9.4 | 102 | 6.6% |
| 9.8 | 76 | 4.9% |
| 9.2 | 69 | 4.4% |
| 10 | 67 | 4.3% |
| 10.5 | 66 | 4.3% |
| 11 | 59 | 3.8% |
| 9.3 | 59 | 3.8% |
| 9.6 | 59 | 3.8% |
| 9.7 | 54 | 3.5% |
| Other values (52) | 804 | 51.8% |

| Value | Count | Frequency (%) |
|---|---|---|
| 8.4 | 2 | 0.1% |
| 8.5 | 1 | 0.1% |
| 8.7 | 2 | 0.1% |
| 9 | 27 | 1.7% |
| 9.05 | 1 | 0.1% |
| 9.1 | 22 | 1.4% |
| 9.2 | 69 | 4.4% |
| 9.233333333 | 1 | 0.1% |
| 9.25 | 1 | 0.1% |
| 9.3 | 59 | 3.8% |

| Value | Count | Frequency (%) |
|---|---|---|
| 14 | 1 | 0.1% |
| 13.6 | 1 | 0.1% |
| 13.5 | 1 | 0.1% |
| 13.4 | 3 | 0.2% |
| 13.3 | 3 | 0.2% |
| 13.2 | 1 | 0.1% |
| 13.1 | 1 | 0.1% |
| 13 | 5 | 0.3% |
| 12.9 | 8 | 0.5% |
| 12.8 | 16 | 1.0% |
quality
Real number (ℝ≥0)
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.6344294 |
| Minimum | 3 |
| Maximum | 8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 12.2 KiB |
Quantile statistics
| Minimum | 3 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 5 |
| median | 6 |
| Q3 | 6 |
| 95-th percentile | 7 |
| Maximum | 8 |
| Range | 5 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.7958446396 |
|---|---|
| Coefficient of variation (CV) | 0.1412467143 |
| Kurtosis | 0.2488314966 |
| Mean | 5.6344294 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.2504714688 |
| Sum | 8739 |
| Variance | 0.6333686903 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=6)
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 667 | 43.0% |
| 6 | 621 | 40.0% |
| 7 | 190 | 12.3% |
| 4 | 49 | 3.2% |
| 8 | 16 | 1.0% |
| 3 | 8 | 0.5% |

| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 8 | 0.5% |
| 4 | 49 | 3.2% |
| 5 | 667 | 43.0% |
| 6 | 621 | 40.0% |
| 7 | 190 | 12.3% |
| 8 | 16 | 1.0% |

| Value | Count | Frequency (%) |
|---|---|---|
| 8 | 16 | 1.0% |
| 7 | 190 | 12.3% |
| 6 | 621 | 40.0% |
| 5 | 667 | 43.0% |
| 4 | 49 | 3.2% |
| 3 | 8 | 0.5% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.
To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
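The calculation described above can be sketched in plain Python. This is a minimal illustration, not the profiler's implementation; `pearson_r` and the sample data are hypothetical. The normalising factor (n or n - 1) appears in both the covariance and the standard deviations, so it cancels and is omitted.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Unnormalised covariance and spreads; the common 1/n factor cancels.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    # A constant input has zero spread and raises ZeroDivisionError here.
    return cov / (spread_x * spread_y)


# Hypothetical data, not taken from the dataset.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
r_plain = pearson_r(x, y)
# Shifting and rescaling x leaves r unchanged (location and scale invariance).
r_shifted = pearson_r([10 * a + 3 for a in x], y)
# Reversing the direction of y flips the sign.
r_negated = pearson_r(x, [-b for b in y])
print(r_plain, r_shifted, r_negated)
```

With pandas, `Series.corr(other, method="pearson")` computes the same quantity.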
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.
To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
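A minimal sketch of this rank-based calculation follows. The helper names and the sample data are hypothetical, and giving tied values the average of the ranks they span is an assumption (it is the common convention, but the text above does not specify tie handling).

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y))  # monotonic relationship, so close to 1
print(pearson_r(x, y))     # noticeably below 1 because the relation is not linear
```

The example shows why ρ catches a monotonic but nonlinear relationship that r understates.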
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.
To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
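The pair-counting definition above corresponds to the variant usually called τ-a, sketched below. `kendall_tau_a` and the sample data are hypothetical. Library implementations such as `scipy.stats.kendalltau` default to τ-b, which additionally corrects for ties, so results can differ when the data contain tied values.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs; ties count as neither."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: one swapped pair among six possible pairs.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)  # 5 concordant, 1 discordant, 6 pairs
print(tau)
```

This direct version examines every pair, so its cost grows quadratically with the number of observations.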
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
First rows
| | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.103306 | 0.70 | 0.00 | 0.450049 | 0.076 | 2.867882 | 3.861490 | 0.9978 | 3.51 | -0.801405 | 9.4 | 5 |
| 1 | 1.116875 | 0.88 | 0.00 | 0.572939 | 0.098 | 4.102521 | 4.686835 | 0.9968 | 3.20 | -0.476630 | 9.8 | 5 |
| 2 | 1.116875 | 0.76 | 0.04 | 0.530233 | 0.092 | 3.317116 | 4.421290 | 0.9970 | 3.26 | -0.546246 | 9.8 | 5 |
| 3 | 1.198255 | 0.28 | 0.56 | 0.450049 | 0.075 | 3.504207 | 4.550629 | 0.9980 | 3.16 | -0.737642 | 9.8 | 6 |
| 4 | 1.103306 | 0.70 | 0.00 | 0.450049 | 0.076 | 2.867882 | 3.861490 | 0.9978 | 3.51 | -0.801405 | 9.4 | 5 |
| 5 | 1.103306 | 0.66 | 0.00 | 0.423877 | 0.075 | 3.107333 | 4.056652 | 0.9978 | 3.51 | -0.801405 | 9.4 | 5 |
| 6 | 1.120087 | 0.60 | 0.06 | 0.360736 | 0.069 | 3.317116 | 4.529951 | 0.9964 | 3.30 | -1.206374 | 9.4 | 5 |
| 7 | 1.099721 | 0.65 | 0.00 | 0.164038 | 0.065 | 3.317116 | 3.292242 | 0.9946 | 3.39 | -1.157869 | 10.0 | 7 |
| 8 | 1.116875 | 0.58 | 0.02 | 0.473382 | 0.073 | 2.587814 | 3.113045 | 0.9968 | 3.36 | -0.768947 | 9.5 | 7 |
| 9 | 1.106811 | 0.50 | 0.36 | 0.746971 | 0.071 | 3.504207 | 5.212636 | 0.9978 | 3.35 | -0.251804 | 10.5 | 5 |
Last rows
| | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1541 | 1.072117 | 0.725 | 0.20 | 0.772217 | 0.073 | 4.341848 | 4.891621 | 0.99770 | 3.29 | -0.870044 | 9.2 | 5 |
| 1542 | 1.058737 | 0.550 | 0.15 | 0.423877 | 0.077 | 4.165264 | 3.896182 | 0.99314 | 3.32 | -0.220914 | 11.6 | 6 |
| 1543 | 1.011306 | 0.740 | 0.09 | 0.394331 | 0.089 | 3.413163 | 3.542844 | 0.99402 | 3.67 | -0.801405 | 11.6 | 6 |
| 1544 | 1.058737 | 0.510 | 0.13 | 0.530233 | 0.076 | 4.341848 | 4.056652 | 0.99574 | 3.42 | -0.336470 | 11.0 | 6 |
| 1545 | 1.080486 | 0.620 | 0.08 | 0.450049 | 0.068 | 4.284797 | 3.994882 | 0.99651 | 3.42 | -0.220914 | 9.5 | 6 |
| 1546 | 1.054038 | 0.600 | 0.08 | 0.473382 | 0.090 | 4.503448 | 4.171856 | 0.99490 | 3.45 | -0.737642 | 10.5 | 5 |
| 1547 | 1.039149 | 0.550 | 0.10 | 0.513157 | 0.062 | 4.835265 | 4.351412 | 0.99512 | 3.52 | -0.318618 | 11.2 | 6 |
| 1548 | 1.058737 | 0.510 | 0.13 | 0.530233 | 0.076 | 4.341848 | 4.056652 | 0.99574 | 3.42 | -0.336470 | 11.0 | 6 |
| 1549 | 1.039149 | 0.645 | 0.12 | 0.473382 | 0.075 | 4.503448 | 4.171856 | 0.99547 | 3.57 | -0.413072 | 10.2 | 5 |
| 1550 | 1.044250 | 0.310 | 0.47 | 0.660415 | 0.067 | 3.590783 | 4.115556 | 0.99549 | 3.39 | -0.522316 | 11.0 | 6 |
Most frequently occurring
| | fixed acidity | volatile acidity | citric acid | residual sugar | chlorides | free sulfur dioxide | total sulfur dioxide | density | pH | sulphates | alcohol | quality | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 19 | 1.076354 | 0.460 | 0.24 | 0.394331 | 0.077 | 3.590783 | 3.861490 | 0.99480 | 3.39 | -0.678257 | 10.6 | 6 | 4 |
| 49 | 1.096052 | 0.360 | 0.46 | 0.494302 | 0.074 | 4.037594 | 4.171856 | 0.99534 | 3.40 | -0.177388 | 11.0 | 7 | 4 |
| 60 | 1.096052 | 0.695 | 0.13 | 0.473382 | 0.076 | 2.991877 | 3.235373 | 0.99546 | 3.29 | -0.870044 | 10.1 | 5 | 4 |
| 78 | 1.106811 | 0.510 | 0.02 | 0.394331 | 0.084 | 3.107333 | 3.751279 | 0.99538 | 3.36 | -0.870044 | 10.5 | 6 | 4 |
| 4 | 1.044250 | 0.500 | 0.00 | 0.277689 | 0.057 | 3.317116 | 3.542844 | 0.99448 | 3.36 | -1.257103 | 9.5 | 5 | 3 |
| 11 | 1.063313 | 0.640 | 0.21 | 0.423877 | 0.081 | 3.215430 | 3.751279 | 0.99689 | 3.59 | -0.522316 | 9.8 | 5 | 3 |
| 36 | 1.088455 | 0.650 | 0.02 | 0.494302 | 0.066 | 2.427186 | 3.496620 | 0.99720 | 3.47 | -0.499122 | 9.5 | 6 | 3 |
| 37 | 1.088455 | 0.690 | 0.07 | 0.559943 | 0.091 | 3.317116 | 3.292242 | 0.99572 | 3.38 | -0.678257 | 11.3 | 6 | 3 |
| 57 | 1.096052 | 0.630 | 0.00 | 0.450049 | 0.097 | 3.215430 | 3.994882 | 0.99675 | 3.37 | -0.737642 | 9.0 | 6 | 3 |
| 101 | 1.116875 | 0.600 | 0.26 | 0.473382 | 0.080 | 4.451077 | 5.531052 | 0.99622 | 3.21 | -0.944133 | 9.9 | 5 | 3 |